Summary

The distribution of error is considered when a function y of x is
rounded, and when x is uniformly distributed. The example discussed is
$y = \sin x$, and it is thought that the round-off error might be nearly
uniformly distributed.

The non-uniformity is very small, and the sample size needed to detect
this by the $A^2$ statistic is examined. The study is of interest in the
examination of ancient and mediaeval tables.


Key Words: distribution of error; goodness-of-fit; tables of functions.
1. INTRODUCTION

This problem was brought to the Statistical Consulting Service at
Simon Fraser University by G. van Brummelen, a graduate student in the
History of Mathematics. It contains interesting probabilistic features and
a statistical application which appear to be worth recording.

The problem concerns the distribution of the error $\epsilon$ when a value
of y, a function of x, is rounded, say to 2 d.p. The distribution is the sum
of many terms, and at first sight it may appear to be approximately uniform;
this brings in the statistical application, which is to examine how one
would detect that it is non-uniform.

Mr. van Brummelen describes the origin of the problem as follows:

Many ancient and medieval astronomical treatises contain
numerical tables which allow the reader to calculate planetary
positions and related phenomena. The formulae implicit in these
tables are given, but the errors in the tabular values do not
usually reflect what one would expect from a straightforward
computation. This may be due to the use of interpolation or
other timesaving techniques, or to varying levels of rounding.

In order to determine the calculation methods used by the author
of a table, I am developing (have developed) several numerical
and statistical tests. These are designed to search for
interpolation grids, check for dependence of one table on
another, and find an error distribution given an hypothesized
calculation method, for example. Many of the tests require the
assumption that the error caused by rounding a set of computed
values to some level is nearly uniformly distributed. It is this
assumption that I wish to verify.
2. DISTRIBUTION OF ERROR FOR $y = \sin x$

We examine the distribution of round-off error for the function
$y = \sin x$; the value of x is considered to be uniformly distributed in the
interval $0 \le x \le \pi/2$. Suppose y is rounded to accuracy $\Delta$; in
what follows we assume $\Delta = 0.01$. Then the error in y is found as
follows. Suppose y is rounded to $i\Delta$, for $i = 1, \ldots, n-1$; the true
value of y must have been such that $y_{1i} < y \le y_{2i}$, where
$y_{1i} = i\Delta - \Delta/2$ and $y_{2i} = i\Delta + \Delta/2$. These
correspond to $x_{1i} = \sin^{-1} y_{1i}$ and $x_{2i} = \sin^{-1} y_{2i}$; call
the interval $(x_{1i} < x \le x_{2i})$ the i-th interval. The error in y is

\[ \epsilon = y - \sin x , \tag{1} \]

where y on the right-hand side is the rounded value, so that
$\epsilon = i\Delta - \sin x$ in the i-th interval.

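Suppose F(t) is the distribution function of $\epsilon$; that is,
$F(t) = P(\epsilon < t)$. The contribution to F(t) from the i-th interval,
for $1 \le i \le n-1$, is

\[ F_i(t) = \frac{2}{\pi}\left\{ \sin^{-1}(i\Delta + \Delta/2) - \sin^{-1}(i\Delta - t) \right\}, \qquad -\frac{\Delta}{2} \le t \le \frac{\Delta}{2}. \tag{2} \]

The top and bottom intervals are special cases. For
$0 \le x \le \sin^{-1}(\Delta/2)$ the error is negative, and the contribution
to F(t) is $F_0(t) = 2\{\sin^{-1}(\Delta/2) + \sin^{-1}(t)\}/\pi$ for
$-\Delta/2 \le t \le 0$; for $\sin^{-1}(1 - \Delta/2) \le x \le \pi/2$, the
error is positive and the contribution to F(t) is
$F_n(t) = 2\{\pi/2 - \sin^{-1}(1 - t)\}/\pi$ for $0 \le t \le \Delta/2$.
When these are put together we have finally, with $n\Delta = 1$,

\[ F(t) = \frac{2}{\pi}\left[ \sum_{i=1}^{n-1} \left\{ \sin^{-1}(i\Delta + \Delta/2) - \sin^{-1}(i\Delta - t) \right\} + \sin^{-1}(\Delta/2) + \sin^{-1}(t) \right], \qquad -\frac{\Delta}{2} \le t \le 0; \]

\[ F(t) = \frac{2}{\pi}\left[ \sum_{i=1}^{n-1} \left\{ \sin^{-1}(i\Delta + \Delta/2) - \sin^{-1}(i\Delta - t) \right\} + \frac{\pi}{2} - \sin^{-1}(1 - t) + \sin^{-1}(\Delta/2) \right], \qquad 0 \le t \le \frac{\Delta}{2}. \]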
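The density is

\[ f(t) = \frac{2}{\pi}\left[ \sum_{i=1}^{n-1} \frac{1}{\{1 - (i\Delta - t)^2\}^{1/2}} + \frac{1}{(1 - t^2)^{1/2}} \right], \qquad -\frac{\Delta}{2} \le t \le 0; \]

and

\[ f(t) = \frac{2}{\pi}\left[ \sum_{i=1}^{n-1} \frac{1}{\{1 - (i\Delta - t)^2\}^{1/2}} + \frac{1}{(2t - t^2)^{1/2}} \right], \qquad 0 < t \le \frac{\Delta}{2}. \]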

The density has an infinite value as t approaches zero from above. Thus,
it is certainly not uniform, but through much of the range it will be close
to uniform. Table 1 gives values of $F(\epsilon)$ and $f(\epsilon)$ for a
range of values of $\epsilon$, when $\Delta = 0.01$. Figures 1a, 1b are plots
of $F(\epsilon)$ and $f(\epsilon)$ from Table 1, and Figures 2a, 2b are
similar plots on a larger scale, to show the sharp change in density at
$\epsilon = 0$.
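
The distribution function and density of the rounding error can be evaluated
numerically from the sums given in this section. The Python sketch below is
an illustration only, not the authors' program: the names `cdf` and `density`
are chosen here, and it assumes $n\Delta = 1$ with $\Delta = 0.01$ and a
leading factor $2/\pi$ in both expressions. Under those assumptions the
values should agree closely with Table 1.

```python
import math

DELTA = 0.01  # rounding accuracy used in the text


def _check_range(t, delta):
    if not -delta / 2 <= t <= delta / 2:
        raise ValueError("t must lie in [-delta/2, delta/2]")


def cdf(t, delta=DELTA):
    """F(t) = P(error < t) for y = sin x, x uniform on [0, pi/2]."""
    _check_range(t, delta)
    n = round(1 / delta)  # assumes n * delta = 1
    # Interior intervals i = 1, ..., n-1.
    total = sum(
        math.asin(i * delta + delta / 2) - math.asin(i * delta - t)
        for i in range(1, n)
    )
    total += math.asin(delta / 2)  # bottom interval, upper limit
    if t <= 0:
        total += math.asin(t)  # bottom interval, partly counted
    else:
        total += math.pi / 2 - math.asin(1 - t)  # top interval
    return 2 / math.pi * total


def density(t, delta=DELTA):
    """f(t); at t = 0 this returns the limit from below."""
    _check_range(t, delta)
    n = round(1 / delta)
    total = sum(
        1 / math.sqrt(1 - (i * delta - t) ** 2) for i in range(1, n)
    )
    if t <= 0:
        total += 1 / math.sqrt(1 - t * t)
    else:
        total += 1 / math.sqrt(2 * t - t * t)  # unbounded as t -> 0+
    return 2 / math.pi * total


if __name__ == "__main__":
    # Compare with Table 1, which lists F(0) = 0.4766 and f(-0.005) = 97.28.
    for t in (-0.005, -0.0025, 0.0, 0.0025, 0.005):
        print(f"{t:8.4f}  {cdf(t):.4f}  {density(t):8.2f}")
```

Evaluating `density` at small positive t shows the spike just above zero that
makes the distribution detectably non-uniform.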

3. THE DETECTION OF NON-UNIFORMITY

The following statistical problem can then be posed. Suppose U(a,b)
denotes the uniform distribution between a and b, and suppose a sample of
size N is taken from F(t). How large must N be in order to reject

$H_0$: the errors $\epsilon$ are $U(-\Delta/2, \Delta/2)$?

The size of N will clearly depend on the statistic used; we have examined a
statistic which is generally accepted to be powerful for such a test, namely
the EDF statistic $A^2$ (for the definition and tables, see Stephens, 1936).

Table 2 gives the number of Monte Carlo samples, out of 100, which were
detected as significant by this statistic, using samples of size N. The
percentages are given for several test sizes $\alpha$. Three sets of samples
of size 2000, and two each of sizes 5000 and 10000, were included to show the
variability in power of $A^2$ to detect the non-uniformity. The table shows
that even with 2000 values, a 5% test would detect this delicate departure
from uniformity only about 20 times in 100; the sample size must go to 10,000
to find a power of over 85%. Thus one can suppose the error distribution
will appear uniform to many observers and can probably be treated as such for
many statistical purposes.
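
A Monte Carlo experiment of this kind can be sketched as follows. This is a
hedged illustration, not the program used for the study: the function names
are hypothetical, the random number generator and seed are arbitrary, and
the 5% critical value 2.492 is assumed from commonly tabulated percentage
points of $A^2$ for a fully specified null distribution. The statistic is
computed in its usual form,
$A^2 = -N - \frac{1}{N}\sum_{i=1}^{N}(2i-1)\{\ln u_{(i)} + \ln(1 - u_{(N+1-i)})\}$,
after mapping each error to $u = (\epsilon + \Delta/2)/\Delta$.

```python
import math
import random

DELTA = 0.01       # rounding accuracy, as in the text
CRIT_5PCT = 2.492  # assumed upper 5% point of A^2 (fully specified null)


def rounding_errors(n_values, rng, delta=DELTA):
    """Errors (rounded minus true) for y = sin x, x uniform on [0, pi/2]."""
    errors = []
    for _ in range(n_values):
        y = math.sin(rng.uniform(0.0, math.pi / 2))
        rounded = math.floor(y / delta + 0.5) * delta
        errors.append(rounded - y)
    return errors


def anderson_darling_uniform(u):
    """A^2 statistic for the hypothesis that the values u are U(0, 1)."""
    n = len(u)
    tiny = 1e-12  # guard against log(0) at the ends of the range
    s = sorted(min(max(v, tiny), 1.0 - tiny) for v in u)
    total = 0.0
    for i in range(n):  # i is 0-based, so the weight 2i - 1 becomes 2i + 1
        total += (2 * i + 1) * (math.log(s[i]) + math.log(1.0 - s[n - 1 - i]))
    return -n - total / n


def estimated_power(n_values, n_samples=100, seed=1, delta=DELTA):
    """Fraction of Monte Carlo samples rejected by the 5% test."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_samples):
        eps = rounding_errors(n_values, rng, delta)
        u = [(e + delta / 2) / delta for e in eps]
        if anderson_darling_uniform(u) > CRIT_5PCT:
            rejections += 1
    return rejections / n_samples


if __name__ == "__main__":
    for n_values in (2000, 5000, 10000):
        print(n_values, estimated_power(n_values))
```

The rejection fractions vary from run to run with the seed, in the same way
that the repeated rows of Table 2 differ from one another.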
Table 1

Values of the distribution and density of $\epsilon$

    $\epsilon$    $F(\epsilon)$    $f(\epsilon)$

     -0.0050         0.0000           97.28
     -0.0047         0.0243           97.03
     -0.0045         0.0485           96.80
     -0.0042         0.0727           96.57
     -0.0040         0.0968           96.36
     -0.0038         0.1209           96.15
     -0.0035         0.1449           95.95
     -0.0033         0.1688           95.76
     -0.0030         0.1928           95.58
     -0.0028         0.2166           95.40
     -0.0025         0.2405           95.23
     -0.0023         0.2642           95.06
     -0.0020         0.2880           94.90
     -0.0018         0.3117           94.74
     -0.0015         0.3354           94.59
     -0.0013         0.3590           94.44
     -0.0010         0.3826           94.29
     -0.0008         0.4061           94.15
     -0.0005         0.4297           94.01
     -0.0003         0.4531           93.87
      0.0000         0.4766           93.74
      0.0002         0.5141          121.45
      0.0005         0.5432          112.98
      0.0007         0.5709          109.16
      0.0010         0.5979          106.84
      0.0012         0.6244          105.22
      0.0015         0.6506          103.99
      0.0017         0.6764          103.01
      0.0020         0.7021          102.21
      0.0022         0.7275          101.52
      0.0025         0.7528          100.92
      0.0027         0.7780          100.40
      0.0030         0.8030           99.92
      0.0032         0.8280           99.50
      0.0035         0.8528           99.11
      0.0037         0.8775           98.75
      0.0040         0.9022           98.41
      0.0042         0.9267           98.10
      0.0045         0.9512           97.81
      0.0047         0.9756           97.54
      0.0050         1.0000           97.28

TABLE 2

The table gives the number of Monte Carlo samples, each of size N, which
give a significant value of $A^2$ at level $\alpha$. The number of Monte Carlo
samples generated, for each N, was 100.

    N \ $\alpha$:    0.25    0.10    0.05    0.01

     2000              60      31      16       4
     2000              56      32      23       6
     2000              59      36      21       6
     5000              92      77      67      25
     5000              89      67      50      23
    10000             100      96      88      69
    10000             100      99      91      63

[Figure: Density function of $\epsilon$; vertical axis: density.]